aoa_ratings <- read_xlsx(path = "../data/words_aoa_ratings.xlsx", sheet = 1)%>%
filter(Word %in% c("carrot","duck","bread","apple","kite","horseshoe","plug","garlic","barrel","eggplant","pawn","papaya"))%>%
mutate(mean_aoa = as.numeric(Rating.Mean),
item = Word)%>%
select(item,mean_aoa)
me_data <- read_csv("../data/me.csv")
prior_data <- read_csv("../data/novelty.csv")
comb_data <- read_csv("../data/combination.csv")%>%
mutate(model = "data")%>%
left_join(aoa_ratings) %>%
ungroup()%>%
mutate(item = fct_reorder(factor(item), mean_aoa))
We first present the empirical studies, then the general modelling framework and the pragmatic model in detail, and then the results. The results come in two parts. The first is prediction: which model best predicts integration based only on the experiments for the individual inferences? The second is explanation: if we use all the data we have, how can we best explain what is happening? Here we fit the models to the integration data. We consider two hypotheses. The first is our pragmatic model, in which the mutual exclusivity inference is conditional on the prior. The second is a mixture model, in which the two inferences are computed separately and mixed by a certain ratio. We also explore a developmental mixture model in which we allow the mixture component to change with age.
The first experiment tested the so-called mutual exclusivity inference in children between 2 and 5 years of age. The general phenomenon is that when presented with a familiar and an unfamiliar object, children expect a novel word to refer to the unfamiliar object (e.g. Markman and Wachtel 1988). A range of explanations have been put forward for the cognitive basis of this inference (see Lewis et al. 2020 for a discussion). Here, we treat the mutual exclusivity inference as pragmatic (e.g. Clark 1987). The inference process is specified in the model below.
The first goal of this experiment was to quantify developmental change in the age range tested. The second goal of Experiment 1 was to test the role of semantic knowledge (cf. Lewis et al. 2020). The assumption is that the strength of the mutual exclusivity inference varies with knowledge of the word for the familiar object. That is, when the familiar object is an object for which children are less likely to know the word, they are less likely to assume that the novel word refers to the unfamiliar object. To test this, we systematically varied the familiar object that was presented with the novel object.
The experiment was preregistered at https://osf.io/gy37b. The experiment itself can be run by downloading the associated repository and opening the file experiments/kids/kids_me.html.
We tested a total of 90 children, including 30 2-year-olds (range = 2.03 - 3.00, 15 girls), 30 3-year-olds (range = 3.03 - 3.97, 22 girls) and 30 4-year-olds (range = 4.03 - 4.90, 16 girls). Data from 10 additional children were not included because they were exposed to less than 75% of English at home (5), did not finish at least half of the test trials (2), the technical equipment failed (2) or their parents reported an autism spectrum disorder (1). All children were recruited from the floor of a Children’s museum in San José, California, USA. This population is characterized by diverse ethnic backgrounds (predominantly White, Asian, or mixed ethnicity) and high levels of parental education and socioeconomic status. Parents consented to their children’s participation and provided demographic information. All experiments were approved by the Stanford Institutional Review Board (protocol no. 19960).
The experiment was presented as an interactive picture book on a tablet computer (Frank et al. 2016). Figure 1A shows the general setup. Children saw an animal between two tables. For each animal character, we recorded a set of utterances (one native English speaker per animal) that were used to make requests. Each experiment started with two training trials in which the speaker requested a known object (car and ball).
In Experiment 1, there was a familiar object on one table and a novel object (drawn for the purpose of the study) on the other. The speaker requested an object by saying “Oh cool, there is a [non-word] on the table, how neat, can you give me the [non-word]?”. Children responded by touching one of the objects. The location of the novel object (left or right table) and the animal character were counterbalanced. Each child received 12 trials, one with each familiar object. The novel object also changed from trial to trial. We coded a response as correct if children chose the novel object as the referent of the novel word.
Figure 1: Schematic experimental procedure with screenshots from the experiments.
Each child completed 12 trials, each with a different familiar and a different novel object. Familiar objects were selected to vary in how likely children were to know the word for each object. This included objects that most 2-year-olds could name (e.g. a duck) as well as objects that only very few 5-year-olds could name (e.g. a pawn). The selection was based on age of acquisition ratings from Kuperman and colleagues (2012). While these ratings do not capture the absolute age at which children acquire these words, they capture the relative order in which words are learned. Figure 2A shows the objects used in the experiment. We introduced this variation to estimate the role of semantic knowledge in a mutual exclusivity inference.
chance_me <- me_data %>%
group_by(subage, subid) %>%
summarise(correct = mean(correct)) %>%
summarise(correct = list(correct)) %>%
group_by(subage)%>%
mutate(Mean= round(mean(unlist(correct)),2),
BayesFactor = format(round(extractBF(ttestBF(unlist(correct), mu = 0.5))$bf), scientific = F),
`Age group` = subage)%>%
ungroup()%>%
select(`Age group`, Mean, BayesFactor)
knitr::kable(chance_me, caption = "Proportion of children choosing the novel object compared to a level expected by chance based on a one sample Bayesian t-test. Responses are aggregated for each participant across familiar objects.", digits = 2)
| Age group | Mean | BayesFactor |
|---|---|---|
| 2 | 0.61 | 132 |
| 3 | 0.73 | 185881356 |
| 4 | 0.86 | 72514087738 |
As a first step, we evaluated whether children made a mutual exclusivity inference. For this analysis, we aggregated participants’ responses across familiar objects. We used the function ttestBF from the R package BayesFactor (Morey and Rouder 2018) to compute a Bayes factor (BF) in favor of the hypothesis that children chose the novel object more often than expected by chance (50% correct). Table 1 shows that all age groups made the inference (all BF > 100).
# prior_me <- c(prior(normal(0, 5), class = Intercept),
# prior(normal(0, 5), class = b),
# prior(cauchy(0, 1), class = sd))
#
#
# bm_me <- brm(correct ~ age + (1|subid) + (age | item) + (age | agent),
# data = me_data, family = bernoulli(),
# control = list(adapt_delta = 0.99, max_treedepth = 20),
# sample_prior = F,
# prior = prior_me,
# cores = 4,
# chains = 4,
# iter = 5000)%>%
# saveRDS(.,"../saves/bm_me.rds")
bm_me <- readRDS("../saves/bm_me.rds")
fixef_me <- as_tibble(fixef(bm_me), rownames = "term")
ranef_me <- ranef(bm_me)
ranef_plot_me <- as_tibble(ranef_me$item, rownames = "item")%>%
mutate(grand_intercept = fixef_me%>%filter(term=="Intercept")%>%pull(Estimate),
grand_slope = fixef_me%>%filter(term=="age")%>%pull(Estimate))%>%
group_by(item) %>%
tidyr::expand(Estimate.Intercept,Estimate.age,grand_intercept,grand_slope, age = unique(me_data$age))%>%
mutate(y = plogis(grand_intercept + Estimate.Intercept+(Estimate.age+grand_slope)*age),
age = age+min(me_data$age_num))%>%
left_join(aoa_ratings)%>%
ungroup()%>%
mutate(item = fct_reorder(factor(item), mean_aoa))
plot_me <- ggplot(ranef_plot_me, aes(x=age,y = y, col = item))+
geom_hline(yintercept = 0.5, lty=2)+
geom_jitter(data = me_data%>%left_join(aoa_ratings), aes(x = age_num, y = correct, col = reorder(item, mean_aoa)), width = 0, height = 0.02, alpha = .2)+
geom_line(size = 1)+
labs(x="Age",y="Mutual exclusivity effect")+
theme_few() +
ylim(-0.05,1.05)+
xlim(2,5)+
guides(alpha = F)+
scale_colour_viridis_d(name = "Object")
cor_plot_me <- as_tibble(ranef_me$item, rownames = "item") %>%
left_join(aoa_ratings)%>%
ggplot(., aes(x = mean_aoa, y = Estimate.Intercept))+
geom_abline(intercept = 1, slope = -1, lty = 2, alpha = 1, size = .5)+
geom_point(pch = 4, size = 2)+
geom_smooth(method = "lm", col = "black", se = F, lty = 2, size = .5)+
xlab("Rated age of acquisition")+
ylab("Mutual exclusivity effect (model intercept)")+
ylim(-1,1)+
stat_cor(method = "pearson", label.x = 7, label.y = .99)+
theme_few()
[To do: report the standard deviation of the by-item random effects as evidence for variation between items.]
As a second step, we investigated how the inference changed as a function of age and the familiar object. We modeled the trial-by-trial data using a Bayesian generalized linear mixed model (GLMM). We used the function brm from the package brms (Bürkner 2017). As priors we used normal(0,5) for fixed effects and cauchy(0,1) for standard deviations of random effects. The model formula was correct ~ age + (1 | id) + (age | object) + (age | agent). That is, we modeled an overall slope for age (continuous, anchored at the minimum) and the object-specific developmental trajectories as deviations from the overall intercept and slope (random effects). The estimate for age was positive and reliably different from zero (\(\beta\) = 0.91, 95% CrI: 0.58 - 1.3): older children were more likely to make a mutual exclusivity inference. Figure 2B visualizes the model-based developmental trajectory for each familiar object and shows that there was substantial variation between them. Figure 2C shows the correlation between rated age of acquisition and the object-specific model intercept. The mutual exclusivity effect was stronger for words that were rated to be acquired earlier. Put differently, objects for which children were less likely to know the word produced a weaker mutual exclusivity effect. Taken together, the mutual exclusivity inference depended on age as well as on the familiar object.
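The object-specific trajectories combine the population-level estimates with each object's deviations from them. The following is a minimal sketch of that computation: the slope of 0.91 is the estimate reported above, whereas the intercept and both item-level deviations are made-up values chosen purely for illustration.

```r
# Population-level (fixed) effects on the log-odds scale.
grand_slope <- 0.91      # reported estimate for age
grand_intercept <- 0.3   # hypothetical value, not reported in the text

# Hypothetical item-level deviations (random effects) for one familiar object.
item_intercept <- 0.4
item_slope <- -0.1

# Age is anchored at the minimum, so 0 corresponds to the youngest child.
age <- seq(0, 3, by = 0.5)

# Item-specific probability of choosing the novel object at each age.
p_novel <- plogis(grand_intercept + item_intercept +
                    (grand_slope + item_slope) * age)
round(p_novel, 2)
```

Because the deviations are added on the log-odds scale before applying `plogis`, an object with a negative intercept deviation starts lower and one with a negative slope deviation rises more slowly with age.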
ggarrange(plot_aoa, plot_me, cor_plot_me, labels = c("A","B","C"), nrow = 1, widths = c(1,1.3,1))
Figure 2: A: Familiar words and corresponding pictures by rated age of acquisition. B: Developmental trajectories of the mutual exclusivity effect by familiar object based on the mean of the model posterior distribution. Dots show individual data points. Lighter colors indicate later rated age of acquisition. Dotted line indicates a level of performance expected by chance. C: Correlation between rated age of acquisition and mutual exclusivity effect (model-based intercept for each familiar object).
Here we tested children’s sensitivity to common ground that is built up over the course of a conversation. In particular, we tested whether children keep track of which object is new to a speaker and which one the speaker has encountered previously (Akhtar, Carpenter, and Tomasello 1996; Diesendruck et al. 2004). The main goal of the experiment was to measure how children’s sensitivity to common ground changes with age.
The experiment was preregistered at https://osf.io/au5hr. The experiment itself can be run by downloading the associated repository and opening the file experiments/kids/kids_novel.html.
We tested 58 children from the same general population as in Experiment 1, including 18 2-year-olds (range = 2.02 - 2.93, 7 girls), 19 3-year-olds (range = 3.01 - 3.90, 14 girls) and 21 4-year-olds (range = 4.07 - 4.93, 14 girls). Data from 5 additional children were not included because they were exposed to less than 75% of English at home (3) or the technical equipment failed (2).
The general setup was the same as in Experiment 1. The speaker was positioned between the tables. There was a novel object (drawn for the purpose of the study) on one of the tables while the other table was empty. Next, the speaker turned to one of the tables and either commented on the presence (“Aha, look at that.”) or the absence (“Hm, nothing there”) of an object. Then the speaker disappeared. While the speaker was away, a second novel object appeared on the previously empty table. Then the speaker returned and requested an object in the same way as in Experiment 1 (see also Figure 1B). The positioning of the novel object at the beginning of the experiment as well as the location the speaker turned to first were counterbalanced. Children received five trials, each with a different pair of novel objects. We coded a response as correct if children chose the object that was new to the speaker as the referent of the novel word.
chance_prior <- prior_data %>%
group_by(subage, subid) %>%
summarise(correct = mean(correct)) %>%
summarise(correct = list(correct)) %>%
group_by(subage)%>%
mutate(Mean= round(mean(unlist(correct)),2),
BayesFactor = format(round(extractBF(ttestBF(unlist(correct), mu = 0.5))$bf,2), scientific = F),
`Age group` = subage)%>%
ungroup()%>%
select(`Age group`, Mean, BayesFactor)
knitr::kable(chance_prior, caption = "Proportion of children choosing the object that was new to the speaker compared to a level expected by chance based on a one sample Bayesian t-test. Responses are aggregated for each participant across trials.", digits = 2)
| Age group | Mean | BayesFactor |
|---|---|---|
| 2 | 0.55 | 0.4 |
| 3 | 0.76 | 26.55 |
| 4 | 0.83 | 6956.06 |
Table 2 compares children’s correct responses to a level expected by chance (50%). We found evidence that, as a group, 3- and 4-year-olds, but not 2-year-olds, inferred that the novel word referred to the object that was new to the speaker.
# prior_cg <- c(prior(normal(0, 5), class = Intercept),
# prior(normal(0, 5), class = b),
# prior(cauchy(0, 1), class = sd))
#
#
# bm_cg <- brm(correct ~ age + (1|subid) + (age | agent),
# data = prior_data, family = bernoulli(),
# control = list(adapt_delta = 0.99, max_treedepth = 20),
# sample_prior = F,
# prior = prior_cg,
# cores = 4,
# chains = 4,
# iter = 5000)%>%
# saveRDS(.,"../saves/bm_cg.rds")
bm_cg <- readRDS("../saves/bm_cg.rds")
fixef_cg <- as_tibble(fixef(bm_cg), rownames = "term")
plot_cg_data <- prior_data %>%
group_by(age_num, subid) %>%
summarise(correct = mean(correct))
plot_cg_samples <- posterior_samples(bm_cg, "^b", subset = 1:200)%>%
mutate(sample = 1:length(b_age))%>%
expand_grid(.,unique(prior_data$age))%>%
mutate(age = `unique(prior_data$age)`,
y = plogis(b_Intercept + b_age * age))%>%
select(-`unique(prior_data$age)`)
plot_cg_map <- as_tibble(fixef(bm_cg), rownames = "term")%>%
select(term, Estimate)%>%
spread(term, Estimate)%>%
expand_grid(.,unique(prior_data$age))%>%
mutate(slope = age,
age = `unique(prior_data$age)`,
y = plogis(Intercept + slope * age))%>%
select(-`unique(prior_data$age)`)
plot_cg <- ggplot() +
geom_hline(yintercept = 1/2, lty=2, size = 1)+
geom_jitter(data = plot_cg_data,aes(x = age_num, y= correct), width = .00, height = .01, alpha = .5)+
geom_line(data = plot_cg_samples, aes(x = age+min(prior_data$age_num), y = y, group = sample), size = .025)+
geom_line(data = plot_cg_map, aes(x =age+min(prior_data$age_num), y = y), size = 1)+
labs(x="Age",y="Proportion object new to speaker chosen")+
theme_few() +
ylim(-0.05,1.05)+
xlim(2,5)+
guides(alpha = F)
To directly investigate whether children’s responses changed with age, we modeled the trial-by-trial data using a Bayesian GLMM (formula: correct ~ age + (1 | id) + (age | speaker); for further specifications see Experiment 1). The estimate for age was positive and reliably different from zero (\(\beta\) = 0.92, 95% CrI: 0.37 - 1.54, see Figure 3A). Older children were more likely to choose the object that was new to the speaker as the referent of the novel word, suggesting that sensitivity to common ground in this context increases with age.
Experiment 3 combined the procedures from Experiments 1 and 2. As a consequence, children had to consider not just their semantic knowledge of the word for the familiar object and the inference this licenses but also the role that each object (novel and familiar) had played in the preceding interaction. Combining the two procedures created two conditions. In the congruent condition, the novel object was also the object that was new to the speaker. In this case, the mutual exclusivity inference as well as the common ground inference pointed to the novel object as the referent. In the incongruent condition, the familiar object was new to the speaker. In this case, the two inferences pointed to different objects. The main focus of the overall study was to model how children integrate and balance these different information sources. We investigate this question in depth in the modelling section below. Here, we limit the discussion to whether children differentiated between the two conditions.
The experiment was preregistered at https://osf.io/4nm8g. The experiment itself can be run by downloading the associated repository and opening the file experiments/kids/kids_combination.html.
We tested 220 children from the same general population as in Experiments 1 and 2, including 76 2-year-olds (range = 2.04 - 2.99, 7 girls), 72 3-year-olds (range = 3.00 - 3.98, 14 girls) and 72 4-year-olds (range = 4.00 - 4.94, 14 girls). Data from 20 additional children were not included because they were exposed to less than 75% of English at home (15), did not finish at least half of the test trials (3) or the technical equipment failed (2).
Experiment 3 followed the same procedure as Experiment 2 but involved the same objects as Experiment 1. In the beginning, one table was empty while there was an object (novel or familiar) on the other one. After commenting on the presence or absence of an object on each table, the speaker disappeared and a second object appeared (familiar or novel). Next, the speaker re-appeared and made the usual request.
In the congruent condition, the familiar object was present in the beginning and the novel object appeared while the speaker was away (Figure 1C - left). In this case, both the mutual exclusivity and the common ground inference pointed to the novel object as the referent. In the incongruent condition, the novel object was present in the beginning and the familiar object appeared later. In this case, the two inferences pointed to different objects (Figure 1C - right).
Participants received up to 12 test trials, six in each condition, each with a different familiar and novel object. Familiar objects were the same as in Experiment 1. The positioning of the objects on the tables and the location the speaker first turned to were counterbalanced. Participants could stop the experiment after six trials (three per condition). If a participant stopped after half of the trials, we tested an additional participant to reach a pre-registered number of data points per cell.
All results are reported from the perspective of the mutual exclusivity inference (correct in the model formula below). In the incongruent condition, high proportions indicate a mutual exclusivity inference and low proportions a common ground inference. In the congruent condition, both inferences pointed in the same direction. The focus of this experiment was on information integration and we therefore did not compare performance to chance.
# prior_comb <- c(prior(normal(0, 5), class = Intercept),
# prior(normal(0, 5), class = b),
# prior(cauchy(0, 1), class = sd))
#
# bm_comb <- brm(correct ~ age * alignment + (alignment | subid) + (age * alignment | item)+ (age * alignment | agent),
# data = comb_data, family = bernoulli(),
# control = list(adapt_delta = 0.99, max_treedepth = 20),
# sample_prior = F,
# prior = prior_comb,
# cores = 4,
# chains = 4,
# inits = 0,
# iter = 5000)%>%
# saveRDS(.,"../saves/bm_comb.rds")
bm_comb <- readRDS("../saves/bm_comb.rds")
fixef_comb <- as_tibble(fixef(bm_comb), rownames = "term")
ranef_comb <- ranef(bm_comb)
plot_comb_map <- bind_rows(
as_tibble(ranef_comb$item, rownames = "item")%>%
mutate(condition = "congruent",
condition_code = 0),
as_tibble(ranef_comb$item, rownames = "item")%>%
mutate(condition = "incongruent",
condition_code = 1))%>%
mutate(grand_intercept = fixef_comb%>%filter(term=="Intercept")%>%pull(Estimate),
grand_age = fixef_comb%>%filter(term=="age")%>%pull(Estimate),
grand_cond = fixef_comb%>%filter(term=="alignmentincongruent")%>%pull(Estimate),
grand_intact = fixef_comb%>%filter(term=="age:alignmentincongruent")%>%pull(Estimate))%>%
group_by(item, condition, condition_code)%>%
expand_grid(. ,age = unique(comb_data$age))%>%
mutate(y = plogis(grand_intercept +
Estimate.Intercept +
grand_cond * condition_code +
Estimate.alignmentincongruent * condition_code +
grand_age * age +
Estimate.age * age +
grand_intact * (condition_code * age) +
`Estimate.age:alignmentincongruent` * (condition_code * age)),
age = age+min(comb_data$age_num))%>%
left_join(aoa_ratings)%>%
ungroup()%>%
mutate(item = fct_reorder(factor(item), mean_aoa))
plot_comb <- ggplot(plot_comb_map, aes(x=age,y = y, col = item))+
geom_hline(yintercept = 0.5, lty=2)+
geom_jitter(data = comb_data%>%left_join(aoa_ratings)%>%ungroup()%>%mutate(item = fct_reorder(factor(item), mean_aoa)), aes(x = age_num, y= correct, col = item), width = .00, height = .04, alpha = .2)+
geom_line(size = 1)+
labs(x="Age",y="Mutual Exclusivity effect")+
facet_grid(~condition)+
theme_few() +
ylim(-0.05,1.05)+
xlim(2,5)+
guides(alpha = F)+
scale_colour_viridis_d(name = "Object")
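[To do: use the standard deviation of the by-item random intercepts to show that items vary substantially, and the standard deviation of the by-item age and interaction terms to show that they differ in their slopes.]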
We modeled the trial-by-trial data in the following way: correct ~ age * alignment + (alignment | subid) + (age * alignment | item) + (age * alignment | agent). We had pre-registered including item as a fixed effect in Experiment 3, but the corresponding model was too complex to be constrained by the data. Furthermore, as explained in Experiment 1, items were chosen based on their rated age of acquisition. That is, we assumed that they are not necessarily different kinds but that they represent different locations on a distribution of required semantic knowledge. For further model specifications, see Experiment 1. The estimate for age was reliably positive (\(\beta\) = 0.81, 95% CrI: 0.4 - 1.24). The incongruent condition had a strong negative impact (\(\beta\) = -1.35, 95% CrI: -2.17 - -0.55), showing that children differentiated between the two conditions. The estimate for the interaction term was negative, but its credible interval included zero, so the data only weakly suggest a shallower slope for age in the incongruent condition (\(\beta\) = -0.2, 95% CrI: -0.66 - 0.27). Figure 3B visualizes the model. Taken together, the results show that children responded to the way the two inferences were aligned with one another. For the remainder of the study, we address the question of how the two inferences might have interacted with one another.
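To make the size of these effects more concrete, the population-level estimates can be turned into predicted proportions. The sketch below uses the three reported coefficients; the intercept is not reported in the text, so the value used here is a placeholder assumption, and `predict_me` is a helper name introduced only for this illustration.

```r
# Reported population-level estimates (log-odds scale).
b_age <- 0.81
b_incongruent <- -1.35
b_interaction <- -0.20
b_intercept <- 1.0  # hypothetical: the intercept is not reported in the text

# Predicted probability of a response in line with mutual exclusivity.
predict_me <- function(age, incongruent) {
  plogis(b_intercept + b_age * age + b_incongruent * incongruent +
           b_interaction * age * incongruent)
}

age <- 0:2  # years above the youngest age in the sample
congruent <- predict_me(age, incongruent = 0)
incongruent <- predict_me(age, incongruent = 1)
data.frame(age, congruent, incongruent)
```

On the log-odds scale, the age slope is 0.81 in the congruent condition and 0.81 - 0.20 = 0.61 in the incongruent condition, which is the shallower slope referred to above.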
ggarrange(plot_cg, plot_comb, labels = c("A","B"), nrow = 1, widths = c(1,2))
Figure 3: Proportion of choosing the object that was new to the speaker by age. Dots show the mean response for each participant. The solid black line shows the developmental trajectory based on the mean of the model posterior distribution. Lighter lines show 200 random draws from the posterior distribution to depict uncertainty. Dotted line indicates a level of performance expected by chance.
The main purpose of the study was to investigate how children integrate different information sources during word learning. To address this question, we use Bayesian cognitive models of pragmatic reasoning. We first describe an integration model which we think best represents the inference and integration processes. We then specify how this model captures developmental change.
Next, we ask how well this model predicts how children integrate the two information sources. That is, in a situation in which we know the developmental trajectories for the mutual exclusivity inference as well as for the common ground inference, what can we say about how they will be combined? Importantly, we ask this question before any data on integration have been collected. The model offers an answer and also allows us to make quantitative predictions. We can then test the predictive power of the model by comparing the model predictions to newly collected data. We will use model comparisons to test the integration model against a range of alternative models.
Finally, we ask how well our model explains the way that children integrate the two information sources. For this analysis, we fit the free parameters in the model to all the available data: those from Experiments 1 and 2 as well as the integration data from Experiment 3. We then compare the model to a range of alternative models that make different assumptions about how information is integrated and how this process develops. This approach answers the question of how we can best explain how children are integrating the different information sources.
The cognitive models are situated in the Rational Speech Act (RSA) framework (Frank and Goodman 2012; Goodman and Frank 2016). RSA models are models of pragmatic reasoning in that they treat language understanding as a special case of Bayesian social reasoning. A listener interprets an utterance by assuming it was produced by a cooperative speaker who had the goal to be informative. Being informative is defined as providing a message that would increase the probability of the listener inferring the speaker’s intended message. This notion of contextual informativeness captures the Gricean idea of cooperation between speaker and listener.
The model captures the following process. A listener is reasoning about the referent of a speaker’s utterance while at the same time trying to learn a lexicon (object–word mappings). This reasoning is contextualized by the prior probability of each referent. This prior probability is thought to be a function of the common ground shared between speaker and listener in that interacting around the objects changes the probability that they will be referred to later. We assume that the degree to which interactions around objects change their prior probability depends on the child’s age.
To decide between referents, the listener reasons about what a rational speaker would say given an intended referent. This speaker is assumed to compute the informativity of each available utterance and then choose the most informative one. However, this expectation of speaker informativeness may vary and is captured by the term alpha. In particular, we take alpha to be a function of the child’s age.
The informativity of each utterance is given by imagining which referent a literal listener, who interprets words according to their literal semantics, would infer upon hearing it. Thus, this reasoning depends on what kind of word–object mappings the speaker thinks the literal listener knows. We assume that knowing the literal semantics of a word is not deterministic but probabilistic. That is, for each object involved there is a probability p that the literal listener knows the word for it. For each of the novel objects, this semantic knowledge is 0. For familiar objects, it depends on the kind of object as well as on the child’s age.
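For orientation, the three reasoning steps described above correspond to the three levels of a generic RSA model. The equations below give one common textbook formulation, not the exact specification used here, which additionally treats the literal listener's lexical knowledge as probabilistic. The symbols \(r\) (referent), \(u\) (utterance) and \(\mathcal{L}(u, r)\) (literal meaning: 1 if \(u\) applies to \(r\), 0 otherwise) are notation introduced for this sketch.

\[
\begin{aligned}
P_{L_0}(r \mid u) &\propto \mathcal{L}(u, r) \\
P_{S_1}(u \mid r) &\propto P_{L_0}(r \mid u)^{\alpha} \\
P_{L_1}(r \mid u) &\propto P_{S_1}(u \mid r)\, P(r)
\end{aligned}
\]

Here \(\alpha\) is the expected speaker informativeness and \(P(r)\) is the prior probability of each referent, which in this study is shaped by common ground.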
The model description above points to three potential loci of developmental change: semantic knowledge, expectations about speaker informativeness and common ground sensitivity. Each of these components is represented by a parameter in the model. We capture developmental change by making these parameters a function of age, thereby estimating a developmental trajectory (intercept and slope) for each parameter.
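As an illustration of what a trajectory defined by an intercept and a slope looks like, the sketch below applies the same transformations that the parameter plots in this document use: a logistic link for semantic knowledge and a linear function for speaker informativeness. All intercept and slope values are hypothetical and are not the fitted estimates.

```r
# Age in years above the minimum age in the sample.
age <- seq(0, 3, by = 1)

# Semantic knowledge for one familiar object: a probability, so the linear
# predictor is passed through the logistic function (hypothetical values).
sem_know <- plogis(-0.5 + 1.2 * age)

# Speaker informativeness (alpha): linear in age (hypothetical values).
alpha <- 1.0 + 0.8 * age

data.frame(age, sem_know, alpha)
```

With one intercept and one slope per familiar object, each object gets its own semantic knowledge curve, while speaker informativeness follows a single trajectory shared across objects.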
Semantic knowledge captures the degree of certainty with which the naive listener is assumed to know the label for the familiar object. When faced with the task, we think that children take their own semantic knowledge as the basis. As a consequence, semantic knowledge differs between familiar objects. For objects whose labels are generally acquired earlier (e.g. carrot) semantic knowledge is high, whereas for others (e.g. pawn) semantic knowledge is lower. However, semantic knowledge also varies with age in that older children are more likely than younger children to know the labels for more of the familiar objects. As a consequence, each familiar object has a unique developmental trajectory with respect to semantic knowledge. The likelihood term of the model depends not just on the parameter settings for semantic knowledge but also on the value of the parameter capturing expectations about speaker informativeness (alpha, see next section). As a consequence, these two parameters are estimated in conjunction and co-vary with one another.
A second locus of developmental change is expectations about speaker informativeness. In the context of the model, speaker informativeness corresponds to the degree to which the listener expects the speaker to choose the most informative of all available utterances. We assume that children at different ages might have different expectations about how rational or informative speakers are. As mentioned above, this parameter is jointly estimated with the parameters for semantic knowledge.
Sensitivity to common ground refers to the probability that an object is taken to be the referent of the utterance before actually hearing the utterance. Thus, it captures the salience of an object due to its role in the social interaction that precedes the utterance. We expect children at different ages to respond differently to the common ground manipulation, resulting in an age-specific prior distribution over objects.
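To show how the three parameters could interact, here is a deliberately simplified toy listener for a scene with one familiar and one novel object. It is a sketch under strong assumptions, not the WebPPL implementation used in this study: the speaker is assumed to choose between exactly two words (the familiar label and the novel word), the novel word is assumed to be literally compatible with both objects, and the function name `pragmatic_listener` and its arguments are introduced only for this example.

```r
# theta:       probability that the literal listener knows the familiar word
# alpha:       expected speaker informativeness
# prior_novel: prior probability that the novel object is the referent
pragmatic_listener <- function(theta, alpha, prior_novel) {
  # If the familiar word is known, it picks out the familiar object for
  # certain, while the novel word leaves both objects equally likely (0.5).
  s_novel_word_for_familiar_known <- 0.5^alpha / (1 + 0.5^alpha)
  s_novel_word_for_novel_known <- 1
  # If the familiar word is unknown, neither word is informative.
  s_unknown <- 0.5

  # Speaker probability of the novel word, averaged over lexical knowledge.
  s_familiar <- theta * s_novel_word_for_familiar_known + (1 - theta) * s_unknown
  s_novel <- theta * s_novel_word_for_novel_known + (1 - theta) * s_unknown

  # Posterior probability that the novel word refers to the novel object.
  numerator <- s_novel * prior_novel
  numerator / (numerator + s_familiar * (1 - prior_novel))
}

# Flat prior: the inference grows with semantic knowledge.
pragmatic_listener(theta = c(0, 0.5, 1), alpha = 1, prior_novel = 0.5)

# A prior favoring the familiar object pulls the inference back down.
pragmatic_listener(theta = 1, alpha = 1, prior_novel = 0.2)
```

In this toy version, semantic knowledge and speaker informativeness both enter through the speaker term, which is one way to see why they trade off against each other during estimation, while the common ground prior enters separately as a multiplicative weight.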
All Bayesian cognitive models were implemented in the probabilistic programming language WebPPL (Goodman and Stuhlmüller 2014). The corresponding model code can be found in the associated online repository and includes information about the prior distributions for all parameters in the model (file xxxxx). To generate model predictions, we estimated age sensitive parameter distributions for semantic knowledge (by item), speaker informativeness and common ground sensitivity and then passed them through the model in line with the different ways in which they can be combined and aligned. The resulting predictions come in the form of distributions of developmental trajectories for each item in the congruent and the incongruent condition. We generated model predictions for each model based on the full posterior distribution for each model parameter. Based on these model predictions, we computed the marginal likelihood of the data given each model and compared them via Bayes factors (see file model_comparison.Rmd in the associated online repository).
In this section we evaluate different models in terms of how well they predict information integration. That is, in a situation in which we know the development of the mutual exclusivity inference as well as the common ground inference, which model best predicts what happens when the two are combined (combination data from Experiment 3). Asking this question automatically excludes all models which include parameters that need to be fit to the combination data itself. To this end, we estimated the model parameters for semantic knowledge and speaker informativeness based on Experiment 1 and the parameter for common ground sensitivity based on Experiment 2. Next, we combined the parameters according to the four models described below. Please note that the parameter distributions were the same for all models (see Figure 4) and that models only differed in how they were combined.
To estimate the parameter distributions, we collected samples from six independent MCMC chains, collecting 150 000 samples from each chain and removing the first 50 000 for burn-in. We removed samples from one chain because it converged on a local maximum and yielded parameter distributions that were substantially different from the other chains. The model outputs can be found in the following online repository: git large file storage.
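The clean-up described above amounts to two filters on the pooled samples. The sketch below shows them on a scaled-down, simulated stand-in for the real output; the column names, the chain index and the sample sizes are assumptions made for illustration only.

```r
# Scaled-down stand-in for the real output (six chains of 150 000 samples).
n_chains <- 6
n_iter <- 1500
burn_in <- 500

set.seed(1)
samples <- data.frame(
  chain = rep(seq_len(n_chains), each = n_iter),
  iteration = rep(seq_len(n_iter), times = n_chains),
  value = rnorm(n_chains * n_iter)
)

# Hypothetical index of the chain that converged on a local maximum.
bad_chain <- 4

# Drop the burn-in iterations and the divergent chain.
kept <- subset(samples, iteration > burn_in & chain != bad_chain)
nrow(kept)  # 5 chains x 1000 post-burn-in samples
```

With the actual settings, the same filters would leave five chains of 100 000 samples each.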
# parameter distributions based on model
item_params <- readRDS("../saves/item_params.rds")
global_params <- readRDS("../saves/global_params.rds")
item_sigma <- readRDS("../saves/item_sigma.rds")
# summaries
item_params_summary <- item_params %>%
group_by(item, parameter)%>%
summarise(mode = estimate_mode(value),
uci = hdi_upper(value),
lci = hdi_lower(value))
global_params_summary <- global_params %>%
group_by (parameter,type)%>%
summarise(mode = estimate_mode(value),
uci = hdi_upper(value),
lci = hdi_lower(value))
# selection object
select <- sample(1:length(unique(global_params$iteration)), 30)
# plot for semantic knowledge
sem_know_map <- item_params_summary%>%
select(-uci,-lci)%>%
spread(parameter, mode) %>%
expand_grid(., age = unique(me_data$age)) %>%
mutate(sem_know = plogis(intercept + slope * age)) %>%
left_join(aoa_ratings) %>%
ungroup()%>%
mutate(item = fct_reorder(factor(item), mean_aoa))
plot_sem_know <- ggplot()+
geom_line(data = sem_know_map, aes(x = age+2, y= sem_know, col = item, group = item),size = 1)+
ylab("Semantic knowledge")+
xlab("Age")+
ylim(0,1)+
theme_few()+
scale_colour_viridis_d(name = "Object")
# plot for speaker optimality
speak_opt <- global_params%>%
filter(parameter == "speaker_optimality")
speak_opt_line <- speak_opt%>%
filter(iteration %in% select)%>%
spread(type, value)%>%
expand_grid(., age = unique(me_data$age))%>%
mutate(y = intercept + slope * age)
speak_opt_map <- global_params_summary%>%
ungroup()%>%
filter(parameter == "speaker_optimality")%>%
select(type, mode)%>%
spread(type, mode)%>%
expand_grid(., age = unique(me_data$age))%>%
mutate(y = intercept + slope * age)
plot_speak_opt <- ggplot() +
geom_line(data = speak_opt_line, aes(x=age+2,y = y, group = interaction(chain,iteration)), size = .2, alpha = .1)+
geom_line(data = speak_opt_map, aes(x=age+2,y = y),size = 1)+
labs(x="Age",y="Speaker informativeness")+
theme_few() +
ylim(0,10)+
xlim(2,5)+
guides(alpha = F, fill = F, col = F)
# Plot prior sensitivity
prior <- global_params%>%
filter(parameter == "prior")
prior_line <- prior%>%
filter(iteration %in% select)%>%
spread(type, value)%>%
expand_grid(., age = unique(me_data$age))%>%
mutate(y = plogis(intercept + slope * age))
prior_map <- global_params_summary%>%
ungroup()%>%
filter(parameter == "prior")%>%
select(type, mode)%>%
spread(type, mode)%>%
expand_grid(., age = unique(me_data$age))%>%
mutate(y = plogis(intercept + slope * age))
plot_prior <- ggplot() +
geom_line(data = prior_line, aes(x=age+2,y = y, group = interaction(chain,iteration)), size = .2, alpha = .1)+
geom_line(data = prior_map, aes(x=age+2,y = y),size = 1)+
labs(x="Age",y="Prior sensitivity")+
theme_few() +
ylim(0,1)+
xlim(2,5)+
guides(alpha = F, fill = F, col = F)
ggarrange(plot_sem_know, plot_speak_opt, plot_prior , labels = c("A","B","C"), nrow = 1, widths = c(1.1,1,1))
Figure 4: Developmental trajectories for model parameters based on the posterior distribution for (A) semantic knowledge, (B) speaker informativeness and (C) prior sensitivity. Solid lines show the MAP estimate for each parameter. Lighter lines in (B) and (C) show 300 random draws from the posterior distribution to visualize uncertainty. (A) does not include these random draws for the sake of clarity.
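The summaries above rely on a point estimate (the mode) and a highest density interval for each parameter. The following is a minimal sketch of how such summaries can be computed from posterior draws. The functions `sketch_mode` and `sketch_hdi` are hypothetical names introduced for illustration and are not necessarily how `estimate_mode`, `hdi_lower` and `hdi_upper` are defined in the analysis code.

```r
# Hypothetical helper: mode of a kernel density estimate of the draws
sketch_mode <- function(x) {
  d <- density(x)
  d$x[which.max(d$y)]
}

# Hypothetical helper: shortest interval containing `mass` of the draws
sketch_hdi <- function(x, mass = 0.95) {
  x <- sort(x)
  n <- length(x)
  k <- ceiling(mass * n)                # number of draws inside the interval
  widths <- x[k:n] - x[1:(n - k + 1)]   # width of every candidate interval
  i <- which.min(widths)                # start index of the narrowest one
  c(lower = x[i], upper = x[i + k - 1])
}

set.seed(1)
draws <- rnorm(5000, mean = 1, sd = 0.5)  # toy posterior draws
sketch_mode(draws)
sketch_hdi(draws)
```

For skewed posteriors the highest density interval can differ noticeably from an equal-tailed quantile interval, which is one reason to report it alongside the mode.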
[overview fig, all model parameters for all models]
# comb_data_binned <- comb_data%>%
# group_by(subage, alignment, item)%>%
# summarize(k = sum(correct), n = n())%>%
# ungroup() %>%
# mutate(a = 1 + k,
# b = 1 + n - k,
# data_mean = (a-1)/(a+b-2),
# data_ci_lower = qbeta(.025, a, b),
# data_ci_upper = qbeta(.975, a, b),
# age = factor(subage))%>%
# select(-a,-b,-n,-k)%>%
# left_join(aoa_ratings) %>%
# ungroup()%>%
# mutate(item = fct_reorder(factor(item), mean_aoa))
#
# select_pred <- sample(1:20000, 60)
#
# plot_prag_pred <- readRDS("../saves/model_pred_prag.rds")%>%
# filter(iteration %in% select_pred)%>%
# left_join(aoa_ratings) %>%
# ungroup()%>%
# mutate(item = fct_reorder(factor(item), mean_aoa))
#
#
# saveRDS(plot_prag_pred, "../saves/plot_prag_pred.rds")
plot_prag_pred <- readRDS( "../saves/plot_prag_pred.rds")
ggplot(data = comb_data, aes(x = age_num, y = correct)) +
geom_hline(yintercept = 0.5, lty=2)+
geom_jitter(col = "black", height = .025, alpha = .1)+
#geom_pointrange(data = comb_data_binned, aes(x = subage+.5, y = data_mean, ymin =data_ci_lower, ymax = data_ci_upper), stroke = 1, pch = 5)+
#geom_line(data = comb_data_binned, aes(x = subage+.5, y = data_mean))+
geom_line(data = plot_prag_pred, aes(x=age+2,y = pred, col = alignment, group = interaction(chain,iteration)), size = .2, alpha = .25)+
geom_smooth(data = comb_data, aes(x = age_num, y = correct), col = "black", method = "glm", method.args = list(family = "binomial"), se = T, alpha = .4, lty = 2, size = 1)+
labs(x="Age",y="Mutual Exclusivity effect")+
facet_grid(alignment~item)+
theme_few() +
ylim(-0.05,1.05)+
xlim(2,5)+
guides(alpha = F, fill = F, col = F)+
scale_colour_ptol(name = NULL)
Figure 5: Model predictions based on the integration model. Colored lines show developmental trajectories for each familiar object and condition based on 300 random draws from the model posterior distribution. The top row (blue) shows the congruent condition and the bottom row (red) shows the incongruent condition. Familiar objects are ordered by their rated age of acquisition (left to right). Dashed black lines show the smoothed conditional mean of the data with 95% CI (in grey).
# cor_prag <- readRDS("../saves/model_pred_prag.rds")%>%
# mutate(age = as.numeric(age) + min(data_comb$age_num),
# age = cut(age,
# breaks = c(2,3,4,5),
# labels = c(2,3,4)))%>%
# group_by(model,alignment, item, age)%>%
# summarise(model_mean = estimate_mode(pred),
# model_ci_lower = hdi_lower(pred),
# model_ci_upper = hdi_upper(pred))
#
# cor_global <- readRDS("../saves/model_pred_global.rds")%>%
# mutate(age = as.numeric(age) + min(data_comb$age_num),
# age = cut(age,
# breaks = c(2,3,4,5),
# labels = c(2,3,4)))%>%
# group_by(model, alignment, item, age)%>%
# summarise(model_mean = estimate_mode(pred),
# model_ci_lower = hdi_lower(pred),
# model_ci_upper = hdi_upper(pred))
#
# cor_flat <- readRDS("../saves/model_pred_flat.rds")%>%
# mutate(age = as.numeric(age) + min(data_comb$age_num),
# age = cut(age,
# breaks = c(2,3,4,5),
# labels = c(2,3,4)))%>%
# group_by(model, alignment, item, age)%>%
# summarise(model_mean = estimate_mode(pred),
# model_ci_lower = hdi_lower(pred),
# model_ci_upper = hdi_upper(pred))
#
# cor_prior <- readRDS("../saves/model_pred_prior.rds")%>%
# mutate(age = as.numeric(age) + min(data_comb$age_num),
# age = cut(age,
# breaks = c(2,3,4,5),
# labels = c(2,3,4)))%>%
# group_by(model, alignment, item, age)%>%
# summarise(model_mean = estimate_mode(pred),
# model_ci_lower = hdi_lower(pred),
# model_ci_upper = hdi_upper(pred))
#
# cor_model_pred <- bind_rows(
# cor_prag,
# cor_global,
# cor_flat,
# cor_prior
# )
#
#
# plot_cor_model_pred <- cor_model_pred %>%
# left_join(
# bind_rows(
# binnned_data%>%mutate(model = "integration"),
# binnned_data%>%mutate(model = "no word knowledge"),
# binnned_data%>%mutate(model = "no common ground"),
# binnned_data%>%mutate(model = "no mutual exclusivity")
# )
# )%>%
# mutate(age = fct_recode(age,
# "2-year-olds" = "2",
# "3-year-olds" = "3",
# "4-year-olds" = "4"))
#
#saveRDS(plot_cor_model_pred, "../saves/corr_model_data_pred.rds")
plot_cor_model_pred <- readRDS( "../saves/corr_model_data_pred.rds")
ggplot(data = plot_cor_model_pred,aes(x = model_mean, y = data_mean, col = alignment)) +
geom_abline(intercept = 0, slope = 1, lty = 2, alpha = 1, size = .5)+
geom_errorbar(aes(ymin = data_ci_lower, ymax = data_ci_upper),width = 0,size = .5, alpha = .7)+
geom_errorbarh(aes(xmin = model_ci_lower, xmax = model_ci_upper), height = 0,size = .5, alpha = .7)+
geom_point(size = 1.5, stroke = 1, pch = 5)+
coord_fixed()+
stat_cor(method = "pearson", label.x = 0.01, label.y = 0.99, aes(x = model_mean, y = data_mean), inherit.aes = F, size = 3)+
xlim(0,1)+ylim(0,1)+
xlab("Model")+
ylab("Data")+
facet_grid(age~model)+
theme_few() +
scale_colour_ptol(name ="Condition")
Figure 6: Correlations between model predictions and data binned by year, item and condition. Vertical and horizontal error bars show 95% HDI. Blue diamonds show congruent condition and red ones show the incongruent condition.
Semantic knowledge is only a function of age and not specific to the item. Roughly corresponds to: if children are vaguely familiar with an object, they make the ME inference regardless of the individual object.
Range of log-likelihoods across chains, then model comparison.
Model parameters are also estimated based on the data from Experiments 1 and 2, but then updated based on the data from Experiment 3.
# parameter distributions based on model
item_params_exp <- readRDS("../saves/item_params_fb.rds")
global_params_exp <- readRDS("../saves/global_params_fb.rds")
item_sigma_exp <- readRDS("../saves/item_sigma_fb.rds")
# summaries
item_params_summary_exp <- item_params_exp %>%
group_by(item, parameter)%>%
summarise(mode = estimate_mode(value),
uci = hdi_upper(value),
lci = hdi_lower(value))
global_params_summary_exp <- global_params_exp %>%
group_by (parameter,type)%>%
summarise(mode = estimate_mode(value),
uci = hdi_upper(value),
lci = hdi_lower(value))
# selection object
select_exp <- sample(1:length(unique(global_params_exp$iteration)), 30)
# plot for semantic knowledge
sem_know_map_exp <- item_params_summary_exp%>%
select(-uci,-lci)%>%
spread(parameter, mode) %>%
expand_grid(., age = unique(comb_data$age)) %>%
mutate(sem_know = plogis(intercept + slope * age)) %>%
left_join(aoa_ratings) %>%
ungroup()%>%
mutate(item = fct_reorder(factor(item), mean_aoa))
plot_sem_know_exp <- ggplot()+
geom_line(data = sem_know_map_exp, aes(x = age+2, y= sem_know, col = item, group = item),size = 1)+
ylab("Semantic knowledge")+
xlab("Age")+
ylim(0,1)+
theme_few()+
scale_colour_viridis_d(name = "Object")
# plot for speaker optimality
speak_opt_exp <- global_params_exp%>%
filter(parameter == "speaker_optimality")
speak_opt_line_exp <- speak_opt_exp%>%
filter(iteration %in% select_exp)%>%
spread(type, value)%>%
expand_grid(., age = unique(comb_data$age))%>%
mutate(y = intercept + slope * age)
speak_opt_map_exp <- global_params_summary_exp%>%
ungroup()%>%
filter(parameter == "speaker_optimality")%>%
select(type, mode)%>%
spread(type, mode)%>%
expand_grid(., age = unique(comb_data$age))%>%
mutate(y = intercept + slope * age)
plot_speak_opt_exp <- ggplot() +
geom_line(data = speak_opt_line_exp, aes(x=age+2,y = y, group = interaction(chain,iteration)), size = .2, alpha = .1)+
geom_line(data = speak_opt_map_exp, aes(x=age+2,y = y),size = 1)+
labs(x="Age",y="Speaker informativeness")+
theme_few() +
ylim(0,10)+
xlim(2,5)+
guides(alpha = F, fill = F, col = F)
# Plot prior sensitivity
prior_exp <- global_params_exp%>%
filter(parameter == "prior")
prior_line_exp <- prior_exp%>%
filter(iteration %in% select_exp)%>%
spread(type, value)%>%
expand_grid(., age = unique(comb_data$age))%>%
mutate(y = plogis(intercept + slope * age))
prior_map_exp <- global_params_summary_exp%>%
ungroup()%>%
filter(parameter == "prior")%>%
select(type, mode)%>%
spread(type, mode)%>%
expand_grid(., age = unique(comb_data$age))%>%
mutate(y = plogis(intercept + slope * age))
plot_prior_exp <- ggplot() +
geom_line(data = prior_line_exp, aes(x=age+2,y = y, group = interaction(chain,iteration)), size = .2, alpha = .1)+
geom_line(data = prior_map_exp, aes(x=age+2,y = y),size = 1)+
labs(x="Age",y="Prior sensitivity")+
theme_few() +
ylim(0,1)+
xlim(2,5)+
guides(alpha = F, fill = F, col = F)
ggarrange(plot_sem_know_exp, plot_speak_opt_exp, plot_prior_exp , labels = c("A","B","C"), nrow = 1, widths = c(1.1,1,1))
Figure 7: Developmental trajectories for model parameters based on the posterior distribution for (A) semantic knowledge, (B) speaker informativeness and (C) prior sensitivity. Solid lines show the MAP estimate for each parameter. Lighter lines in (B) and (C) show 300 random draws from the posterior distribution to visualize uncertainty. (A) does not include these random draws for the sake of clarity.
[overview fig, correlations for all models]
Akhtar, Nameera, Malinda Carpenter, and Michael Tomasello. 1996. “The Role of Discourse Novelty in Early Word Learning.” Child Development 67 (2). Wiley Online Library: 635–45.
Bürkner, Paul-Christian. 2017. “brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. doi:10.18637/jss.v080.i01.
Clark, Eve V. 1987. “The Principle of Contrast: A Constraint on Language Acquisition.” Lawrence Erlbaum Associates, Inc.
Diesendruck, Gil, Lori Markson, Nameera Akhtar, and Ayelet Reudor. 2004. “Two-Year-Olds’ Sensitivity to Speakers’ Intent: An Alternative Account of Samuelson and Smith.” Developmental Science 7 (1). Wiley Online Library: 33–41.
Frank, Michael C, and Noah D Goodman. 2012. “Predicting Pragmatic Reasoning in Language Games.” Science 336 (6084). American Association for the Advancement of Science: 998–98.
Frank, Michael C, Elise Sugarman, Alexandra C Horowitz, Molly L Lewis, and Daniel Yurovsky. 2016. “Using Tablets to Collect Data from Young Children.” Journal of Cognition and Development 17 (1). Taylor & Francis: 1–17.
Goodman, Noah D, and Andreas Stuhlmüller. 2014. “The design and implementation of probabilistic programming languages.” http://dippl.org.
Kuperman, Victor, Hans Stadthagen-Gonzalez, and Marc Brysbaert. 2012. “Age-of-Acquisition Ratings for 30,000 English Words.” Behavior Research Methods 44 (4). Springer: 978–90.
Lewis, Molly L, Veronica Cristiano, Brenden M. Lake, Tammy Kwan, and Michael C Frank. 2020. “The Role of Developmental Change and Linguistic Experience in the Mutual Exclusivity Effect.” Cognition 198: 104191.
Markman, Ellen M, and Gwyn F Wachtel. 1988. “Children’s Use of Mutual Exclusivity to Constrain the Meanings of Words.” Cognitive Psychology 20 (2). Elsevier: 121–57.
Morey, Richard D., and Jeffrey N. Rouder. 2018. BayesFactor: Computation of Bayes Factors for Common Designs. https://CRAN.R-project.org/package=BayesFactor.